Load-Balanced Isosurfacing on Multi-GPU Clusters
Abstract
Isosurface extraction is a common technique in scientific visualization. Growing volume sizes, combined with increasingly hierarchical parallel architectures, make it challenging to distribute isosurfacing workloads efficiently. We propose a technique that, with a modest amount of preprocessing, efficiently distributes isosurfacing load across the GPU compute resources of a cluster. Load uniformity is maximized over a set of user-defined isovalues, enabling better scalability than naive, non-data-centric work distribution approaches.
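The abstract does not spell out the distribution algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the volume is split into blocks, that preprocessing counts the active cells of each block for every user-defined isovalue (a cell is active when its scalar range spans the isovalue), and that blocks are then assigned greedily so that the worst per-isovalue load on any GPU stays low. The names `active_cell_counts` and `assign_blocks`, the block representation, and the greedy heuristic are all hypothetical.

```python
from typing import List, Sequence, Tuple


def active_cell_counts(cell_ranges: Sequence[Tuple[float, float]],
                       isovalues: Sequence[float]) -> List[int]:
    """Preprocessing step (assumed): for each isovalue, count the cells of
    one block whose [min, max] scalar range spans that isovalue."""
    return [sum(1 for lo, hi in cell_ranges if lo <= iso <= hi)
            for iso in isovalues]


def assign_blocks(block_costs: Sequence[Sequence[int]],
                  num_gpus: int) -> Tuple[List[int], List[List[int]]]:
    """Greedy heuristic (assumed, not from the paper): place the heaviest
    block first, each time on the GPU whose peak per-isovalue load after
    the placement is smallest. Returns (owner per block, load per GPU)."""
    num_iso = len(block_costs[0])
    loads = [[0] * num_iso for _ in range(num_gpus)]
    owner = [-1] * len(block_costs)
    # Heaviest blocks first, measured by total active cells over all isovalues.
    order = sorted(range(len(block_costs)),
                   key=lambda b: sum(block_costs[b]), reverse=True)
    for b in order:
        cost = block_costs[b]
        # Pick the GPU that keeps its worst isovalue load lowest.
        gpu = min(range(num_gpus),
                  key=lambda g: max(l + c for l, c in zip(loads[g], cost)))
        owner[b] = gpu
        loads[gpu] = [l + c for l, c in zip(loads[gpu], cost)]
    return owner, loads


if __name__ == "__main__":
    # Four blocks, two isovalues: blocks 0-1 are active only for the first
    # isovalue, blocks 2-3 only for the second.
    costs = [[4, 0], [4, 0], [0, 4], [0, 4]]
    owner, loads = assign_blocks(costs, num_gpus=2)
    print(owner)  # [0, 1, 0, 1]
    print(loads)  # [[4, 4], [4, 4]]
```

In this toy input, a contiguous split that ignores the data (blocks 0 and 1 on one GPU, blocks 2 and 3 on the other) would leave one GPU idle for each isovalue, whereas the data-centric assignment gives both GPUs equal work for both isovalues.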
Similar resources
A Static Load Balancing Scheme for Parallel Volume Rendering on Multi-GPU Clusters
GPU-based clusters are an attractive option for parallel volume rendering. One of the key issues in parallel volume rendering is load balancing: keeping a balanced workload per node is essential for improving performance. A good number of dynamic load balancing schemes have been proposed over the years. However, most of these approaches require runtime dynamic data movement or data duplic...
A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU-CPU clusters
Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing...
Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters
Many modern high performance clusters are heterogeneous at several levels: between computing nodes, and within a node through the addition of specialized accelerators such as GPUs. To achieve high performance of scientific applications on these platforms, it is necessary to perform load balancing. In this paper we present a hierarchical matrix partitioning alg...
A Distributed Multi-GPU System for Fast Graph Processing
We present Lux, a distributed multi-GPU system that achieves fast graph processing by exploiting the aggregate memory bandwidth of multiple GPUs and taking advantage of locality in the memory hierarchy of multi-GPU clusters. Lux provides two execution models that optimize algorithmic efficiency and enable important GPU optimizations, respectively. Lux also uses a novel dynamic load balancing st...
Load-Balanced Multi-GPU Ambient Occlusion for Direct Volume Rendering
Ambient occlusion techniques were introduced to improve data comprehension by bringing soft fading shadows to the visualization of 3D datasets. They attenuate light by considering the occlusion resulting from the presence of neighboring structures. Nevertheless, they often come with a significant precomputation cost, which prevents their use in interactive applications based on trans...